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ABSTRACT 

An initialization strategy, tailored to the prediction of the Madden-Julian oscillation (MJO), is evaluated 
using the Goddard Earth Observing System Model, version 5 (GEOS-5), coupled general circulation model 
(CGCM). The approach is based on the empirical singular vectors (ESVs) of a reduced-space statistically 
determined linear approximation of the full nonlinear CGCM. The initial ESV, extracted using 10 years 
(1990-99) of boreal winter hindcast data, has zonal wind anomalies over the western Indian Ocean, while the 
final ESV (at a forecast lead time of 10 days) reflects a propagation of the zonal wind anomalies to the east 
over the Maritime Continent — an evolution that is characteristic of the MJO. 

A new set of ensemble hindcasts are produced for the boreal winter season from 1990 to 1999 in which the 
leading ESV provides the initial perturbations. The results are compared with those from a set of control 
hindcasts generated using random perturbations. It is shown that the ESV-based predictions have a system- 
atically higher bivariate correlation skill in predicting the MJO compared to those using the random per- 
turbations. Furthermore, the improvement in the skill depends on the phase of the MJO. The ESV is particularly 
effective in increasing the forecast skill during those phases of the MJO in which the control has low skill 
(with correlations increasing by as much as 0.2 at 20-25-day lead times), as well as during those times in 
which the MJO is weak. 


1. Introduction 

Since its discovery about three decades ago, the 
Madden-Julian oscillation (MJO; Madden and Julian 
1971) has been the subject of numerous studies to better 
characterize its behavior and ascertain the underlying 
physical mechanisms at play (e.g., see the review by Zhang 
2005). Also, in recognition of the important impact of 
the MJO on weather and climate variability on subsea- 
sonal time scales (Yasunari 1979; Takayabu et al. 1999; 
Bergman et al. 2001; Kessler 2001; Wheeler and McBride 
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2005), there is an increasing emphasis on efforts to assess 
its predictability and develop better prediction methods. 
Statistical methods have had some success in predicting 
the MJO (Waliser et al. 1999; Lo and Hendon 2000; 
Mo 2001; Jones et al. 2004; Webster and Hoyos 2004; 
Maharaj and Wheeler 2005; Jiang et al. 2008) and pro- 
vide an important benchmark for dynamically based ap- 
proaches that, owing to the poor simulation of the MJO, 
have generally failed to improve upon such simpler ap- 
proaches (Slingo et al. 1996; Waliser et al. 2003a, b; Zhang 
2005; Waliser 2006). However, recent improvements in 
general circulation models (GCMs) including a better sim- 
ulation of subseasonal tropical variability (Slingo 2005; 
Wu et al. 2002; Seo et al. 2009a), have renewed interest 
in using GCMs to predict the MJO (Vitart et al. 2007; 
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Seo et al. 2005, 2009b; Kang and Kim 2010; Rashid 
et al. 2011). 

The skill of MJO predictions has been assessed using 
various air-sea coupled models. For example, Vitart et al. 
(2007) showed that the European Centre for Medium- 
Range Weather Forecasts (ECMWF) coupled model is 
skillful in predicting the evolution of the MJO up to about 
14 days, and Seo (2009) and Seo et al (2009) showed 
that the National Centers for Environmental Prediction 
(NCEP)’s operational coupled Climate Forecast System 
(CFS) model exhibits useful skill out to 2 and 3 pentads 
when the initial MJO convection is located over the Mari- 
time Continent and the Indian Ocean, respectively. In ad- 
dition, Rashid et al. (2011) found that the MJO can be 
predicted using the Predictive Ocean-Atmosphere Model 
for Australia (POAMA) with 10 ensemble members out 
to about 21 days. Kang and Kim (2010) compare the skill 
of MJO predictions using both statistical models and the 
Seoul National University (SNU) CGCM and concluded 
that the limit of skillful predictions made with statistical 
(multivariate regression) models based on the Real-Time 
Multivariate Madden-Julian oscillation (RMM) index oc- 
curs at days 16-17, while it occurs at about 20 days for the 
GCM. 

As a part of the ensemble prediction system, it is 
known that the prediction skill is dependent the struc- 
ture of the ensemble perturbations (Toth and Kalnay 
1993). However, while there has been progress in pre- 
dicting the MJO associated with model improvements, 
there has been less emphasis on the issue of how to best 
(from the stand point of predicting the MJO) perturb the 
initial conditions of the ensemble prediction system. For 
example, in several studies the predictions consist of 
only a single ensemble member — mainly because of the 
heavy computational burden of generating hindcasts 
over a period of two decades (Kang and Kim 2010), or, 
the initial perturbation is not focused on the MJO. Vitart 
et al. (2007) and Vitart and Molteni (2010) used singular 
vectors (an optimal perturbation method) to perturb the 
extratropics (north of 30°N), since their focus was on 
weather forecasts. Rashid et al. (2011) performed en- 
semble predictions in which the atmospheric “pertur- 
bations” are based on the atmospheric analyses 6 h 
before the start time. The method is called lagged av- 
eraged forecasting (LAF; Hoffman and Kalnay 1983) 
and provides an alternative to adding random pertur- 
bations. It turns out that this method partly captures the 
features of so-called optimal perturbations. 

To extract the optimal perturbations for ensemble 
MJO prediction, Liess et al. (2005) adopted a breeding 
approach to generate initial perturbations suitable to 
the MJO prediction within a perfect model framework. 
They defined the rescaling time as the pentad at which 


detection of the modes that grow fastest on the intra- 
seasonal time scale is achieved, without being influenced 
too strongly by higher-frequency weather instability. 
Similarly, Chikamoto et al. (2007) successfully extracted 
the tropical-bred vectors associated with the MJO with 
a one-day rescaling interval. The extracted bred vectors 
show eastward propagation, which begins over the Indian 
Ocean and becomes prominent over the western Pacific. 
This feature resembles the MJO, suggesting that breeding 
methods are suitable for extracting optimal perturbations 
for MJO prediction. However, that study did not per- 
form an ensemble of MJO predictions with their pertur- 
bations to fully validate the importance of optimal 
perturbation for improving MJO prediction skill. 

Motivated by the above facts, this study re-examines 
the optimal perturbations for MJO prediction using an 
empirical singular vector (ESV; Kug et al. 2010, 2011; 
Ham and Kang 2010) approach, applied to an ensemble 
prediction system for boreal winter season. The basic 
concept of the ESV method is similar to the singular 
vector method, which is widely used in weather forecasts 
(Molteni and Palmer 1993; Palmer et al. 1994). The 
main difference between ESV and the singular vector 
method is that the ESV method does not require a lin- 
earized version of a GCM. Instead, the ESV is calcu- 
lated using historical prediction data — something that 
is becoming more widely available for GCMs as an in- 
tegral component of model evaluation and calibration. 
Therefore, the ESV method is more easily applicable 
to various GCMs without the need for a linearized 
version of the model. In addition, unlike the breeding 
method, it is relatively easy to extract optimal pertur- 
bations related to the MJO without additional model 
integrations during the forecasts. 

This paper is organized as follows. In section 2, 
the Goddard Earth Observing System Model, version 5 
(GEOS-5) CGCM developed at the National Aeronautics 
and Space Administration (NASA) Global Modeling and 
Assimilation Office (GMAO) is briefly described, and the 
initialization for the hindcasts and the empirical singular 
vector method is introduced. Section 3 describes the ex- 
perimental design of the hindcasts and the forecast skill 
improvements associated with the use of the ESVs. The 
discussion and brief summary are included in section 4. 

2. Model and hindcast experiments 

a. NASA/GMAO GEOS-5 coupled GCM 

The model used in this study is the NASA/GMAO 
GEOS-5 Coupled General Circulation Model (CGCM). 
The ocean component of NASA/GMAO GEOS-5 
CGCM is the Modular Ocean Model version 4 (MOM4) 
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code (Griffies et al. 2004). The ocean model uses a 5-grid 
finite difference treatment of the primitive equations of 
motion, Boussinesq and hydrostatic approximations 
in spherical coordinates, and covers the global oceans 
with realistic coastlines and bathymetry. The resolution 
is 50 vertical levels and a 1° X 1° horizontal grid tele- 
scoping to V 3 0 meridional spacing near the equator. The 
vertical grid spacing is a constant 10 m over the top 
225 m. The /<- Profile Parameterization (KPP) vertical 
mixing scheme is used in this model. 

The atmospheric component of the GEOS-5 model 
used here has 72 vertical levels and 2° latitude by 2.5° 
longitude grid spacing. The dynamic core is based on 
a finite-volume method (Lin 2004). The convective pa- 
rameterization is the relaxed Arakawa-Schubert (RAS) 
scheme (Moorthi and Suarez 1992). The large-scale con- 
densation scheme is based on a PDF of total water as in 
Smith (1990) or Rotstayn (1997). The free atmospheric 
turbulent diffusivities are based on the gradient Richard- 
son number. For the boundary layer, the Louis (1982) 
scheme is implemented in stable situations with no or 
only weak cooling in the planetary boundary layer 
(PBL) cloud. In addition, the Lock et al. (2000) scheme 
is used for unstable or cloud-topped PBLs. GEOS-5 
incorporates two gravity wave drag parameterizations, 
an orographic gravity wave drag formulation (McFarlane 
1987), and a formulation for nonorographic waves based 
on Garcia and Boville (1994). The Catchment Land 
Surface Model from Koster et al. (2000) is coupled to 
atmospheric model. Air-sea fluxes are exchanged at 
every time step. More details about the GEOS-5 atmo- 
spheric model are provided in Rienecker et al. (2007). 

b. Initialization for the MJO forecasts 

We perform the MJO hindcasts from 1990 to 1999 us- 
ing the GEOS-5 CGCM. The atmosphere -ocean initial 
conditions for this period are obtained by constraining 
the atmospheric component (i.e., zonal and meridional 
wind, temperature, specific humidity, and surface pres- 
sure) of the CGCM with the Modern-Era Retrospec- 
tive Analysis for Research and Applications (MERRA; 
Rienecker et al. 2011) using an incremental analysis up- 
date procedure (IAU; Bloom et al. 1996) beginning in 
1979. We refer to this as a “replay” approach since it 
makes use of an existing atmospheric analysis. 1 Even 


1 The IAU method was originally developed to be part of a data 
assimilation procedure to reduce the shocks of data. The replay 
approach works in the same way except that it uses an existing 
analysis (in this case MERRA) to compute the increments. We 
note that the IAU is different from the traditional nudging ap- 
proach in that the IAU filters the analysis increments only and not 
the full background state (Bloom et al. 1996). 


though the replay approach, directly, only constrains 
the atmospheric component of the coupled model, the 
upper ocean is also adjusted to the observed values via 
air-sea coupling, even without the assimilation of any 
subsurface ocean observations. In particular, it is found 
that the subsurface tropical temperature anomalies from 
the surface to 300 m in the replay simulation are quite 
similar to those obtained from ocean reanalyses (e.g., 
Behringer and Xue 2004; Carton and Giese 2008) (not 
shown). 

With these initial conditions, we perform 30-day 
MJO hindcasts every day for the period 1990-99. 2 This 
“control (CNTL)” set of predictions is used to compute 
the ESVs and also serves as part of the benchmark 
against which to evaluate the MJO prediction skill of 
hindcasts that use ESV-based perturbations (see next 
section). 

c. Description of the ESV 

The ESV method used in this study follows the pro- 
cedure described in Kug et al. (2010, 2011). The main 
feature that distinguishes ESV from the conventional 
singular vector approach is that the linear operator is 
derived empirically from a large number of hindcasts. 
Using matrix multiplication, the empirical linear oper- 
ator is derived using many initial and final states as 
follows: 

L = YX T (XX T ) _1 , (1) 

where each column of X and Y contains the state vectors 
of the hindcasts at the initial and final time, respectively. 

In this study, the linear operator is obtained in a re- 
duced space through a combined empirical orthogonal 
function (CEOF) analysis using equatorially averaged 
(15°S-15°N) 850 hPa, 200-hPa zonal wind, and 200-hPa 
velocity potential initial condition and hindcast data 
during boreal winter (November-April) from 1990 to 
1999. The total number of hindcasts used to obtain the 
linear operator is 1812. To remove the interannual var- 
iability, the anomaly related to the MJO is calculated by 
subtracting the seasonal cycle and the previous 120-day 
mean (Wheeler and Hendon 2004; Rashid et al. 2011). 
Note that the 120-day-mean value for the nth day fore- 
cast is obtained by using n days of forecast output and 
120-n days of observations. Then, the anomaly fields 
are normalized by the square root of the longitudinally 
(0°-360°E) averaged variance. CEOFs are computed 


2 This is in fact part of a larger suite of 6-month hindcasts, al- 
though for the purposes of this study we focus on the first 30 days of 
the hindcasts for the period 1990-99. 
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from the history of initial state vectors [i.e., X in Eq. (1)], 
and only the first 5 leading modes are retained. Simi- 
larly, CEOFs are computed for the 10-day lead, daily- 
averaged forecast data [Y in Eq. (1)], but in this case only 
the two leading CEOFs are retained. Therefore, the 
dimension of the linear operator (L) is 2 X 5. Finally, we 
formulate a separate linear operator for each of the 
initial MJO’s eight phases. 

By retaining only the leading CEOFs in the linear 
operator we filter out modes that tend to be noisy and 
difficult to interpret, and thereby minimize statistical 
sampling issues and overfitting. While this is a rather 
strong filter, we note that the 5 leading CEOFs explain 
about 70% of MJO variability (i.e., the variability 
within the 20-90-day period). It is also not inconsistent 
with Wheeler and Hendon (2004), who show that the 
two dominant CEOFs explain about 60% of the total 
MJO variability (Wheeler and Hendon 2004). We ac- 
knowledge, however, that the number of modes that 
need to be retained to best represent the linear operator 
for seasonal prediction applications remains an open 
question (Kug et al. 2010, 2011). 

One difference in our characterization of the MJO 
compared with Wheeler and Hendon (2004) is that we 
use 200-hPa velocity potential as a proxy for convection 
instead of outgoing longwave radiation (OLR). We do 
this primarily because the velocity potential field is 
more closely linked to the model’s state variables than 
is OLR (since we need to initialize the state variables). 
This is important because OLR is a diagnostic quantity 
that is not assimilated, and so it is likely to be less ac- 
curate and suffer from larger bias than quantities more 
directly linked to the wind field. We also note that ve- 
locity potential is a commonly predicted variable for 
showing MJO propagation in many studies (Molinari 
et al. 1997; Waliser et al. 2003b; Tanaka et al. 2004; 
Waliser et al. 2006). The upper-level velocity potential 
of course cannot fully reflect the complex pattern of 
convection, especially when the convective activity re- 
lated to the MJO passes over the Maritime Continent. 
We do, however, allow for some additional flexibility 
at the large scales (within the velocity potential repre- 
sentation) by formulating the linear operator to be 
a function of the phase of the MJO. 

After obtaining the linear operator, we perform a 
singular-value decomposition (SVD) to obtain the sin- 
gular vectors. The fastest growing singular vector (i.e., 
corresponding singular values are greater than one) 
is defined as the ESV mode (Kug et al. 2010, 2011). It 
was found that the largest singular value is greater than 
one for all initial MJO phases, indicating that the modes 
will grow when they are used as the initial perturbations. 
For example, the singular value is largest at MJO phase 


Initial & Final ESV 



Fig. 1. The equatorially averaged (15°S-15°N) initial (black 
lines) and final (gray lines) ESV of zonal wind at 200 hPa (U200, 
solid line) and zonal wind at 850 hPa (U850, dotted line) at MJO 
phase 4. Note that the values are normalized. 

4 with a value of 1.11 and lowest at MJO phase 2 with 
a value of 1.02. Hereafter, the left (right) singular vector 
with the largest singular value is denoted as the initial 
(final) ESV. Sensitivity tests with different forecast 
lengths (i.e., the ESV is computed using 5-, 10-, and 
20-day lead forecasts) show that the singular values gen- 
erally increase as the forecast lead time is increased. 

Because the time periods used to calculate the ESVs 
overlap with those used for the hindcasts, this method- 
ology has not been fully tested as to whether it is ap- 
plicable to an operational forecasting system. We have, 
however, done some cross validation, primarily to assess 
the stability of the ESV patterns. In particular, we have 
recomputed the ESVs for subsets of the data (by re- 
moving one year at a time) and found that the ESV 
structures are quite stable (not shown). 

Figure 1 shows the initial and final ESVs for MJO 
phase 4. The initial ESV shows positive peaks in the 
200-hPa zonal wind over the western Indian Ocean, 
while it shows negative peaks over South America. 
The initial ESV of the 850-hPa zonal wind shows a 
pattern opposite to that of the 200-hPa zonal wind, 
which implies a baroclinic structure for the initial ESV. 
The final ESV shows a positive peak in the 200-hPa 
zonal wind over the Maritime Continent and a negative 
peak over the far-eastern Pacific. Similar to the initial 
ESV, there is a clear baroclinic structure in the upper- 
and low-level winds. It is interesting that the wind signals 
of the initial ESV over the Indian Ocean appear to have 
propagated to the east, and the wind signals of final 
ESV are over the maritime continents. This eastward- 
propagating feature of the ESV is consistent with the 
characteristics of optimal initial perturbations found in 
previous studies using the breeding method (Chikamoto 
et al. 2007). Sensitivity tests with different forecast 
output show that the ESV using a 5-day lead forecast 
produces a zonal phase difference between the initial 
and final ESV that is almost zero because of the short 
forecast lead time. Also, the final ESV based on a 20-day 
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Fig. 2. The spatial pattern of initial ESV of (a) 200-hPa zonal 
wind (U200), (b) 850-hPa zonal wind (U850), (c) 850-hPa specific 
humidity (q850), and (d) 850-hPa air temperature (T850) at MJO 
phase 4. The units of zonal winds, specific humidity, and air tem- 
perature are m s~\ 10 5 g/kg, and °C, respectively. 
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Longitude 

Fig. 3. The equatorially averaged (15°S-15°N) initial (shading) 
and final (contour) ESV of 200-hPa zonal wind with respect to 
MJO phases. Note that the values are normalized. 


lead forecast time shows a zonal wavenumber-2 struc- 
ture, which is also dominant at 20-day lead forecasts in 
the CNTL predictions (see CNTL in Fig. 12d). 

In addition to the ESV pattern for U200, U850, and 
VP200, the related patterns for all prognostic variables 
are obtained using linear regression to generate initial 
perturbations that are well balanced among all prog- 
nostic variables. That is, we construct the regressed 
patterns of other variables related to the five dominant 
CEOFs, where the predictors are the associated princi- 
ple components (PCs). After the ESV is obtained, the 
spatial pattern of initial perturbations for other vari- 
ables is calculated by multiplying each ESV mag- 
nitude by the regressed patterns. Figure 2 shows the 
spatial pattern of the initial ESV for the regressed zonal 
winds, specific humidity, and temperature at MJO phase 
4. Consistent with Fig. 1, the upper- and lower-level 
zonal winds have anomalies over the equatorial Indian 
Ocean and the far-eastern Pacific. In addition, there are 
strong negative anomalies of the 200-hPa zonal wind 
over the off-equatorial central Pacific. The ESV anom- 
alies in circulation appear to be dynamically linked to 
the moisture fields in that, over the far-western Indian 
Ocean, there is positive moisture anomaly where there is 
low-level convergence. Also, there is a positive temper- 
ature anomaly over Africa possibly generated by zonal 
warm advection because of the easterly wind anomalies. 


These results suggest that the linear regression success- 
fully captures the dynamical linkages between the vari- 
ables. 

Figure 3 shows the equatorially averaged (15°S-15°N) 
ESVs of the 200 hPa zonal wind for all MJO phases. The 
initial ESVs for all MJO phases show robust anomalies 
over the Indian Ocean and South America, while the 
ESVs at MJO phase 7 and 8 shows negative anomalies 
over the Maritime Continent. This shows that the spatial 
pattern of the ESVs is not sensitive to the phase of the 
MJO. The location of the maximum positive values of 
the upper-level wind in the final ESVs (near 110°E for 
phases 3-5) is indicative of eastward propagation from 
the Indian Ocean to the Maritime Continent. The lon- 
gitudinal location of the negative anomalies of the final 
ESVs is similar to that of initial ESVs, or shifted west- 
ward slightly, indicating that the eastward propagating 
signal is limited to the Indian Ocean where the MJO 
appears to initiate. 

After obtaining the spatial pattern of the initial 
perturbations for all variables, the magnitude of the 
perturbation needs to be defined. While this is some- 
what arbitrary, we define the magnitude of the initial 
ESV based on the 200 hPa zonal wind. In parti- 
cular, the root-mean-square (RMS) magnitude of the 
equatorially-averaged initial ESV over the globe is set to 



15 July 2012 


HAM ET AL. 


4937 


10% of the RMS of the equatorially-averaged and fil- 
tered (20-90 days) 200 hPa zonal wind anomaly over the 
globe. The filter consists of a LANCZOS bandpass filter 
(using 45 weights; Duchon 1979), and is applied to the 
daily-mean zonal-wind anomalies at 200 hPa. The mag- 
nitudes of the initial perturbations of the all variables are 
obtained in the same way. 

3. Results 

a. Experimental design for MJO forecasts 

Here we take advantage of a number of existing 
GEOS-5 CGCM hindcasts to form our baseline set of 
experiments. We focus in particular on a set of 30-day 
hindcasts initialized for the 10 winters (November- 
April) of 1990-99 with an interval of 10 days between 
the predictions. That is, the model predictions are made 
every 10 days from 1 November to 30 April, to produce 
a total of 180 forecasts. The 10-day intervals were chosen 
to minimize the impact of any correlations between the 
starting dates. 

Two sets of predictions were generated, denoted 
hereafter by ESV and CNTL. The ESV predictions con- 
sist of two ensemble members whose initial states are 
obtained by adding and subtracting the ESV to the 
baseline initial conditions (i.e., the initial states obtained 
from the replay approach described earlier). The skill of 
the ESV-based predictions is compared with the skill 
of a set of CNTL predictions using random perturba- 
tions. To generate random perturbations, we utilized the 
regressed patterns of the various prognostic fields (de- 
scribed earlier) associated with the five dominant CEOFs. 
The weighting of each CEOF (the PC) is chosen to be a 
random number with a uniform distribution between —1 
and 1. The initial perturbations for all variables are then 
obtained by multiplying these random numbers and the 
regressed patterns to produce what appear to be dy- 
namical balanced perturbations. Just as for the ESV, the 
magnitude of the random perturbation is scaled to be 
10% of the root-mean-square (RMS) magnitude of the 
equatorially averaged and filtered (20-90 days) 200-hPa 
zonal wind anomaly over the globe. 

We note that the time-averaged equatorial-mean spread 
(standard deviation) of the control predictions with ran- 
dom perturbation is about 65% of the spread of ESV 
predictions. That is to be expected in view of the relatively 
fast-growing nature of the ESV perturbations, but it nev- 
ertheless indicates that the spread of the random pertur- 
bation is reasonable and that the control predictions provide 
a useful benchmark for assessing the ESV predictions. 

We generated seven sets of predictions with random 
perturbations. These, together with the unperturbed 



Fig. 4. (a) The MJO phase-averaged final ESV and (b) CEOF- 
filtered 10-day-forecasted initial ESV for all hindcast cases. The 
spatial pattern of ESV after 10-day lead forecast is calculated by 
the averaged difference between the 10-day lead forecast without 
any perturbation and that with positive ESV using the entire 
hindcast samples. Note that the values are normalized. 

predictions allow for 28 combinations of two mem- 
bers with which to compare the 2-member ensembles 
of the ESV predictions. For the 28 possible combi- 
nations of the CNTL predictions, the approximate 
upper 95%, and 99% confidence levels of the corre- 
lations are defined as second largest and largest values, 
respectively. Similarly, the lower 95% and 99% confi- 
dence levels are defined as the second lowest and lowest 
values, respectively. Note that the CNTL prediction skill 
is defined as the mean correlation of the 28 samples. 

b. MJO forecast results 

We begin by examining how well the linear operator 
captures the evolution of the initial ESVs in the non- 
linear integration. To do so, the final ESV that evolved 
using the linear operator is compared with the 10-day 
integration of the initial ESV using the CGCM. The 
spatial pattern of the ESV after 10 days is calculated 
by the averaged difference between the 10-day lead 
forecast with a positive ESV and that without any per- 
turbation using the entire hindcast sample, then the 
difference is CEOF-filtered using 2 dominant modes. 
Note that we also used 2 dominant CEOF modes to 
obtain final ESV. Figure 4 shows the final ESV and the 
CEOF-filtered 10-day forecast of the initial ESV using 
the CGCM. Note that final ESV shown in this figure 
is averaged over all MJO phases because the spatial 
pattern of ESVs is not sensitive to the MJO phases. In 
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(a) CNTL (b) ESV (c) ESV-CNTL 



Longitudes 

Fig. 5. The correlation skill of the equatorially averaged (15°S-15°N) 200-hPa zonal wind in the (a) CNTL and (b) ESV prediction, (c) 
The difference in the correlation skill between the ESV predictions and the CNTL predictions with 99% confidence level are shown. Note 
that the mean correlation of 28 samples of CNTL predictions is shown as CNTL prediction. 


the final ESV, the positive (negative) zonal wind ano- 
maly at 200 hPa over the Maritime Continent (South 
America) is clearly shown. Consistent with the spatial 
pattern of upper-level zonal wind, there is upper-level 
convergence over the western-central Pacific. This feature 
is well captured in the forecasted initial ESV using the 
CGCM. For example, the forecasted initial ESV also 
shows a positive (negative) zonal wind anomaly at 200 hPa 
(850 hPa) over the Maritime Continent, even though the 
peaks of anomalies are shifted to the west. In addition, 
the positive peak of velocity potential at 200 hPa over 
the western-central Pacific is also shown both in the final 
ESV and the forecasted initial ESV. This indicates that the 
evolution of the perturbation, calculated from the empir- 
ical linear operator, captures to some extent the evolution 
of the initial perturbation in the nonlinear model. 

To evaluate the impact of the ESV on the MJO fore- 
cast skill, the skill is compared with that of the CNTL 
predictions. Figure 5 shows the correlation skill of the 
unfiltered equatorial zonal wind at 200 hPa from the 
CNTL and ESV predictions. Note that the mean cor- 
relation of the 28 samples of the CNTL predictions is 
shown as CNTL prediction. In the CNTL predictions, 
the correlation skill of the unfiltered zonal wind at 
200 hPa drops below 0.5 after 12 days. After 20 days, the 
correlation skill drops below 0.2 over most regions. For 
the ESV predictions, the correlation skill show some 
significant improvements compared with the CNTL af- 
ter 10-day lead times over the central-eastern Pacific. 


We next turn to the forecast skill of the dominant 
CEOF-related fields. In this case (before computing the 
correlations), both the predicted and observed zonal 
wind anomalies at 200 hPa are spatially filtered to retain 
only the contributions from the five dominant observed 
CEOFs. Figure 6 shows the correlation skill of the 
CEOF-filtered equatorial zonal wind at 200 hPa. In the 
CNTL predictions, the correlation skill drops below 0.5 
after 15 days. In particular, the correlation skill over the 
eastern Pacific reaches 0.3 after 8 days, which indicates 
that the correlation skill over the eastern Pacific is a local 
minimum compared to the other regions. On the other 
hand, for the ESV predictions, the correlation skill over 
the eastern Pacific does not reach 0.3 until about 12 days. 
Also, the correlation skill over the Maritime Continent 
falls to 0.3 after 20 days in the ESV prediction, while in 
the CNTL prediction the correlations is already below 
0.2 at that forecast lead time. The difference in the 
correlation skill (Fig. 6c) shows that the correlation 
improvement is largest over the eastern Pacific and 
Maritime Continent with 99% confidence level. 

Figure 7 shows the correlation skill of a bivariate in- 
dex (Kang and Kim 2010; Rashid et al. 2011) along with 
confidence levels computed from the 28 different con- 
trol predictions. The value of bivariate correlation skill 
for the CNTL prediction drops to 0.5 at 16 days. Note 
that this prediction skill is higher than the autoregressive 
model introduced in Rashid et al. (2011), and it is similar 
to the prediction skill of the POAMA model with single 
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Longitudes 

Fig. 6. As in Fig. 5, but the anomaly is CEOF filtered. Note that five dominant CEOFs are used. 


ensemble member. In the ESV prediction, it is clear that 
there is systematic improvement in predicting the bi- 
variate index, showing that the bivariate correlation in 
ESV prediction is above 0.5 out to a lead time of 17 days. 
The improvement in the correlation at 15 days is 0.05 
with a 99% confidence level, implying that the en- 
semble prediction with the ESV is beneficial to MJO 
prediction. 

The correlation skill of the RMM1 and RMM2 indices 
is shown in Fig. 8. It is found that the skill improve- 
ment of the RMM2 index is slightly larger than that of 
RMM1 index. The correlation coefficient of the RMM2 
index in the ESV prediction is about 0.5 at 20 days, 
while the correlation of RMM2 index is 0.5 at 14 days. 
On the other hand, the correlation skill of the RMM1 
index in both the ESV and CNTL predictions reaches 
0.5 at day 17. 

A number of studies have examined the sensitivity of 
MJO forecast skill to the MJO amplitude (Jiang et al. 
2008; Kang and Kim 2010; Rashid et al. 2011). Figure 9 
shows, for example, the correlation skill of the bivariate 
index with respect to the initial MJO amplitude. Here, 
that the MJO amplitude is defined as ^/^(O) 2 + a 2 ( 0) 2 , 
where, a { (0) and a 2 (0) are the observed (i.e., MERRA) 
RMM1 and RMM2 at the initial time. We divided the 
total cases into two groups, one consists of the strong 
MJO cases when the initial MJO amplitude is larger 
than one, and the other consists of the weak MJO cases, 
when the initial MJO amplitude is smaller than one. In 
the CNTL prediction, the correlation skill of the bi- 
variate index for the strong MJO case is higher than that 


for the weak MJO case during the early period of the 
forecast. For example, the correlation skill for the strong 
MJO case is more than 0.8, while that for the weak MJO 
cases it is less than 0.7 at 5-day lead time. However, the 
correlation skill with respect to the MJO amplitude 
becomes similar after 15 days in the CNTL predictions. 
In the ESV predictions, the correlation at 15 days for 
the weak and strong MJO cases is about 0.55 and 0.5, 
respectively. 



Forecast Lead Day 

Fig. 7. (a) The bivariate correlation skill of the CNTL (gray line) 
and ESV (black line) predictions for all hindcast cases. The thin 
gray line denotes 95%, and 99% confidence level using samples of 
CNTL prediction results with different combination of random and 
no perturbations. 
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Fig. 8. The correlation skill of (a) RMM1 and (b) RMM2 index in the CNTL (gray) and ESV (black) predictions 
for all hindcast cases are shown. The thin gray line denotes 95%, and 99% confidence level using samples of CNTL 
prediction results with different combination of random and no perturbations. 


The correlation skill improvement in the ESV pre- 
dictions is much higher for the weak MJO case than that for 
the strong MJO case. For the strong MJO case, the forecast 
lead day at which point the correlation skill drops to 0.5 is 
about the same for the CNTL and ESV predictions, while 
for the weak MJO case, it is about 4 days longer in the ESV 
prediction. This shows that the ESV is especially beneficial 
in the weak MJO cases and suggests that the fast-growing 
perturbations are more effective for the less predictable 
periods when the forecast skill is sensitive to the uncer- 
tainty in the initial conditions. This is consistent with 
previous results based on fast-growing perturbations 
determined from both breeding and the singular-vector 
approach (Chen et al. 1997; Xue et al. 1997a, b; Cai et al. 
2003; Ham et al. 2009; Kug et al. 2010). 

Figure 10 shows the bivariate correlation skill in 
the CNTL and ESV predictions as a function of the phase 
of the MJO. In the CNTL prediction, the correlation skill 
is highest during MJO phases 5-8. For example, the cor- 
relation skill drops to 0.5 by about day 12 during phases 1- 
4, while during phases 5-8, the correlation remains above 
0.5 until day 17. At forecast lead times longer than 20 days, 
the CNTL correlation skill is lowest in MJO phases 3-7, 
while for the ESV predictions, the correlation skill shows 
significant improvements for those phases. The im- 
provement of the correlation skill is largest for the MJO 
phase 4, exceeding 0.25 between lead times between 20 
and 25 days. 

The spatial distribution of the skill improvements are 
highlighted in Fig. 11 in terms of the RMS error (RMSE) 


between the CEOF-filtered equatorial velocity potential 
at 200 hPa in the prediction experiments and the ob- 
servations for the cases when the initial MJO phase is 4. 
In the CNTL predictions, the RMSE is relatively high 
over the Indian Ocean, western Pacific, and Atlantic 
Ocean beyond a forecast lead of 10 days. The RMSE grows 



Forecast Lead Day 

Fig. 9. The bivariate correlation skill of the CNTL (dotted line) 
and ESV (solid line) predictions for the strong (gray) and weak 
(black) MJO cases are shown. The MJO amplitude is defined as the 
square root of the RMM1 plus RMM2 variance at the initial time. 
The strong (weak) MJO case when initial MJO amplitude is larger 
(smaller) than one. The thin line denotes 95% and 99% confidence 
level using samples of CNTL prediction results with different 
combination of random and no perturbations. 
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MJO Phases 

Fig. 10. The bivariate correlation skill of (a) CNTL and (b) ESV prediction with respect to the initial MJO phases, (c) The difference of 
the correlation skill in the ESV prediction from that in the CNTL prediction is shown. The x axis (y axis ) denotes the MJ O phases (forecast 
lead day). Note that the improvement over the 99% confidence level is only shaded in (c). 


to a value of about 1.4 over these regions in the CNTL 
prediction, indicating a climatological forecast (Rashid 
et al. 2011). However, compared to the CNTL prediction, 
there is systematic reduction of the RMSE in the ESV 
prediction. For example, the RMSE in the ESV pre- 
dictions during lead times of 5-15 days is less than 0.6 over 
the Atlantic Ocean, while that in the CNTL prediction is 
more than 0.7. In addition, the RMSE over the Indian 
Ocean is less than 1.4 during the entire forecast period with 
the ESV, indicating that the MJO forecast skill of the 
ESV-based predictions is better than a climatological 
forecast out to 30 days. Note that the RMSE reduction 
beyond 15-day lead times is primarily over those regions 
where the RMSE is a local maximum. For example, the 
reduction of the RMSE is greater than 0.4 during 20- 
25-day forecast lead times over the Indian ocean. 

To investigate the time evolution of MJO fields in the 
prediction experiments. Fig. 12 shows the composite of the 


CEOF-filtered 200-hPa velocity potential anomaly — 
again for the case where the MJO is initially in phase 4. 
In the observations (i.e., MERRA reanalysis), the neg- 
ative (positive) 200-hPa velocity potential anomaly is ro- 
bust over the Indo-Maritime Continent (far-eastern Pacific) 
during 1-5-day lead times. The observed divergence signal 
moves to the east, and the negative value of upper-level 
velocity potential (i.e., upper-level divergence) anomaly 
propagates from the Maritime Continent at lead times 
of 1-5 days to the far-eastern Pacific at lead times of 21- 
25 days. As the signal moves to the east, the magnitude 
of MJO-related anomaly gradually weakens. 

During the early phase of the forecast, the spatial 
pattern of the MJO-related anomaly is similar to the 
observed in both prediction experiments. However, in 
the CNTL prediction, the eastward propagation of the 
velocity potential anomaly is too weak, while the ob- 
served peak propagates to the east. In addition, zonal 
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Longitudes 

FIG. 11. The RMSE of the 200-hPa velocity potential at initial MJO phase 4 in the (a) CNTL and (b) ESV prediction, (c) The difference of 
the RMSE in the CNTL prediction from that in the ESV prediction is shown in panel. 


wavenumber-1 structure is damped out too quickly. For 
example, during 11-15-day lead forecast in CNTL pre- 
diction, there are two positive and negative peaks of the 
velocity potential over the equatorial regions. On the 
other hand, in the ESV prediction, the wavenumber-1 
structure is somewhat dominant until days 16-20. Being 
consistent with Fig. 11, the positive (negative) velocity 
potential anomaly over the Indian Ocean (far eastern 
Pacific) after an 11 -day lead forecast is also simulated 
better in the ESV prediction. 

4. Summary and discussions 

This study investigated the impact on forecast skill of 
optimal initial perturbations based on empirical singular 
vectors (ESVs), with a focus on the MJO during the boreal 
winter season. The ESV approach employs a reduced- 
space, linear approximation to the full nonlinear GEOS-5 
CGCM, computed from the statistics of a 10-yr (1990-99) 
hindcast dataset. It was found that the eastward evolution 
(over the Maritime Continent) of the ESV over the first 
10 days of the forecasts resembles aspects of the MJO and 
replicates the evolution produced in the fully nonlinear 
model integrations. 

ESV-based predictions were carried out with two 
ensemble members (± the ESV perturbation) for boreal 
winter season from 1990 to 1999. The forecast skill was 
compared to that of a control (CNTL) set of predictions 
in which the two-member ensemble means are based 
on predictions with random perturbations. It was shown 


that the prediction experiment with the ESV has a sys- 
tematically higher bivariate correlation skill compared 
to that with the random perturbations. In particular, 
the improvement of the correlation skill in the ESV 
prediction is greatest during the MJO phases 4-8, charac- 
terized by enhanced convective activity over the Mari- 
time Continent or western Pacific. During these phases 
the correlation skill in the CNTL prediction is lower 
than during the other MJO phases, indicating that the 
ESV perturbation approach is most effective during 
periods of low skill. Also, the improvement of bivariate 
correlation skill with the ESV is largest for weak MJO 
cases, when the skill improvement is considerably lower 
than for strong MJO cases. 

While the approach used here provides information 
about the spatial pattern of the optimal initial pertur- 
bation, the magnitude of the initial perturbation is not 
well constrained. In fact, the selection of the amplitude 
of the ESV is somewhat arbitrary and is analogous to the 
need to select a norm magnitude in the breeding ap- 
proach (Toth and Kalnay 1993; Ham et al. 2012). In this 
study, the magnitude of the ESV is determined as some 
fraction of the natural variability, however, further work 
is needed to determine whether there is an optimal 
magnitude for the initial perturbations. 

Even though we focused here on the single fastest- 
growing ESV, the number of available ESV modes is 
equal to the number of state vectors, so that in general, 
one can obtain multiple ESV perturbations. This is 
unlike the breeding approach, which is designed to 
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FIG. 12. The composite of 200-hPa velocity potential at initial MJO phase 4 in (left) the observations (i.e„ MERRA), (middle) CNTL 
prediction, and (right) ESV prediction with respect to the forecast lead days. 


generate a single fastest-growing perturbation. Further 
work is needed, however, to investigate the benefits of 
additional ESV perturbations. 

There is little question that an ensemble approach is 
essential to producing skillful predictions of the MJO 
with dynamical models. Progress in identifying opti- 
mal perturbations for subseasonal time scales has been 
slow at least in part because of the expense of running 
CGCMs with a large number of different initial condi- 
tions. This study shows that the ESV approach is a viable 
option for generating initial perturbations that reduce 
uncertainty in MJO predictions with a relatively small 
number of ensemble members. However, it is very likely 
that major advances in dynamically based predictions 
of the MJO will require a dual approach that addresses 
both perturbation strategies and model deficiencies in 
simulating the MJO. 
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